224 research outputs found
A Unified Model for Shared-Memory and Message-Passing Systems
A unified model of distributed systems that accommodates both shared-memory and message-passing communication is proposed. An extension of the I/O automaton model of Lynch and Tuttle, the model provides a full range of types of atomic accesses to shared memory, from basic reads and writes to read-modify-write. In addition to supporting the specification and verification of shared-memory algorithms, the unified model is particularly helpful for proving correspondences between atomic shared objects and invocation-response systems and for proving the correctness of systems that contain both message passing and shared memory (such as a network of shared-memory multiprocessors or a distributed-memory multiprocessor with multi-threaded nodes). As an illustration of the model, we consider distributed systems in which the shared objects have the linearizability property proposed by Herlihy and Wing. We use the model to construct a careful proof that invocation-response systems constructed from linearizable objects simulate atomic shared-memory systems. In addition, we extend the work of Herlihy and Wing by treating not only safety properties of invocation-response systems, but also liveness properties.
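The correspondence the abstract proves, that invocation-response systems built from linearizable objects simulate atomic shared memory, can be illustrated informally: if each operation on a shared object appears to take effect at a single instant (its linearization point), then clients interacting through invocations and responses cannot distinguish it from an atomic register. A minimal Python sketch of this intuition, with hypothetical names and a lock standing in for the linearization point (not the paper's formal automaton model):

```python
import threading

# Hypothetical sketch: a register whose operations take effect atomically
# at a linearization point (here, inside the lock-protected section), so
# invocation-response clients observe the behavior of an atomic register.
class LinearizableRegister:
    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def invoke_write(self, v):
        with self._lock:        # linearization point of the write
            self._value = v
        return "ack"            # response event back to the caller

    def invoke_read(self):
        with self._lock:        # linearization point of the read
            return self._value  # response event carries the value

reg = LinearizableRegister()
reg.invoke_write(42)
print(reg.invoke_read())
```

Concurrent invocations serialize at their linearization points, which is exactly the property that lets the invocation-response interface simulate an atomic shared variable.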
QFAST: Conflating Search and Numerical Optimization for Scalable Quantum Circuit Synthesis
We present a quantum synthesis algorithm designed to produce short circuits
and to scale well in practice. The main contribution is a novel representation
of circuits able to encode placement and topology using generic "gates", which
allows the QFAST algorithm to replace expensive searches over circuit
structures with few steps of numerical optimization. When compared against
optimal-depth, search-based state-of-the-art techniques, QFAST produces
comparable results: circuits 1.19x longer on up to four qubits, with an increase
in compilation speed of 3.6x. In addition, QFAST scales up to seven qubits.
When compared with the state-of-the-art "rule" based decomposition techniques
in Qiskit, QFAST produces circuits shorter by up to two orders of magnitude
(331x), albeit 5.6x slower. We also demonstrate the composability with other
techniques and the tunability of our formulation in terms of circuit depth and
running time.
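The core idea, replacing discrete search over circuit structures with continuous numerical optimization over generic gates, can be illustrated at the smallest scale: fitting the parameters of a generic single-qubit gate to a target unitary by minimizing a phase-invariant infidelity. This is a toy analogy, not QFAST's actual encoding (which also captures placement and topology across multiple qubits); the `u3` parameterization and the cost function are illustrative choices:

```python
import numpy as np
from scipy.optimize import minimize

def u3(theta, phi, lam):
    """Generic single-qubit gate (illustrative parameterization)."""
    return np.array([
        [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
        [np.exp(1j * phi) * np.sin(theta / 2),
         np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
    ])

def infidelity(params, target):
    """Phase-invariant distance: 0 iff u3(params) matches target up to phase."""
    return 1.0 - abs(np.trace(target.conj().T @ u3(*params))) / 2.0

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# A few restarts of a derivative-free local optimizer stand in for the
# numerical-optimization step; no search over circuit structures is needed.
best = min(
    (minimize(infidelity, x0, args=(hadamard,), method="Nelder-Mead",
              options={"xatol": 1e-9, "fatol": 1e-12, "maxiter": 2000})
     for x0 in (np.array([1.0, 0.5, 2.0]), np.array([0.3, -0.2, 0.1]))),
    key=lambda r: r.fun,
)
```

Scaling this pattern to multi-qubit generic "gates" is where the hard work lies; the sketch only shows why continuous optimization can replace enumeration of structures.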
Accelerating Science: A Computing Research Agenda
The emergence of "big data" offers unprecedented opportunities for not only
accelerating scientific advances but also enabling new modes of discovery.
Scientific progress in many disciplines is increasingly enabled by our ability
to examine natural phenomena through the computational lens, i.e., using
algorithmic or information processing abstractions of the underlying processes;
and our ability to acquire, share, integrate and analyze disparate types of
data. However, there is a huge gap between our ability to acquire, store, and
process data and our ability to make effective use of the data to advance
discovery. Despite successful automation of routine aspects of data management
and analytics, most elements of the scientific process currently require
considerable human expertise and effort. Accelerating science to keep pace with
the rate of data acquisition and data processing calls for the development of
algorithmic or information processing abstractions, coupled with formal methods
and tools for modeling and simulation of natural processes as well as major
innovations in cognitive tools for scientists, i.e., computational tools that
leverage and extend the reach of human intellect, and partner with humans on a
broad range of tasks in scientific discovery (e.g., identifying, prioritizing,
and formulating questions; designing, prioritizing, and executing experiments
to answer a chosen question; drawing inferences and evaluating the
results; and formulating new questions, in a closed-loop fashion). This calls
for a concerted research agenda aimed at: Development, analysis, integration,
sharing, and simulation of algorithmic or information processing abstractions
of natural processes, coupled with formal methods and tools for their analyses
and simulation; Innovations in cognitive tools that augment and extend human
intellect and partner with humans in all aspects of science. Comment: Computing Community Consortium (CCC) white paper, 17 pages
Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation
Across a variety of scientific disciplines, sparse inverse covariance
estimation is a popular tool for capturing the underlying dependency
relationships in multivariate data. Unfortunately, most estimators are not
scalable enough to handle the sizes of modern high-dimensional data sets (often
on the order of terabytes), and assume Gaussian samples. To address these
deficiencies, we introduce HP-CONCORD, a highly scalable optimization method
for estimating a sparse inverse covariance matrix based on a regularized
pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal
gradient method uses a novel communication-avoiding linear algebra algorithm
and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving
parallel scalability on problems with up to ~819 billion parameters (1.28
million dimensions); even on a single node, HP-CONCORD demonstrates
scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to
estimate the underlying dependency structure of the brain from fMRI data, and
use the result to identify functional regions automatically. The results show
good agreement with a clustering from the neuroscience literature. Comment: Main paper: 15 pages, appendix: 24 pages
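The parallel proximal gradient method described above builds on the same pattern as the classic ISTA iteration: a gradient step on the smooth loss followed by soft-thresholding for the l1 penalty. A minimal single-node sketch of that pattern on a generic lasso-style objective (not the CONCORD pseudolikelihood itself, and with none of HP-CONCORD's communication-avoiding linear algebra):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, b, lam, step, iters=500):
    """Proximal gradient (ISTA) for min 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                 # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)  # proximal step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]                    # sparse ground truth
b = A @ x_true
step = 1.0 / np.linalg.norm(A, 2) ** 2           # 1/L for convergence
x_hat = ista(A, b, lam=0.1, step=step)
```

The distributed challenge HP-CONCORD addresses is exactly the `A.T @ (A @ x - b)` step at massive scale, where naive matrix products dominate communication cost.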
Extreme Scale De Novo Metagenome Assembly
Metagenome assembly is the process of transforming a set of short,
overlapping, and potentially erroneous DNA segments from environmental samples
into an accurate representation of the underlying microbiomes' genomes.
State-of-the-art tools require big shared memory machines and cannot handle
contemporary metagenome datasets that exceed Terabytes in size. In this paper,
we introduce the MetaHipMer pipeline, a high-quality and high-performance
metagenome assembler that employs an iterative de Bruijn graph approach.
MetaHipMer leverages a specialized scaffolding algorithm that produces long
scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is
end-to-end parallelized using the Unified Parallel C language and therefore can
run seamlessly on shared and distributed-memory systems. Experimental results
show that MetaHipMer matches or outperforms the state-of-the-art tools in terms
of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and
is able to assemble previously intractable grand challenge metagenomes. We
demonstrate the unprecedented capability of MetaHipMer by computing the first
full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion
reads with a total size of 2.6 TBytes. Comment: Accepted to SC1
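The iterative de Bruijn graph approach mentioned above can be illustrated at toy scale: build a graph whose nodes are (k-1)-mers and whose edges are k-mers, then walk unambiguous paths into contigs. A minimal serial sketch (MetaHipMer's actual pipeline is distributed, iterates over multiple k values, and handles sequencing errors and scaffolding; the function name is hypothetical):

```python
from collections import defaultdict

def de_bruijn_contigs(reads, k):
    """Toy assembly: nodes are (k-1)-mers, edges are k-mers; walk
    unambiguous paths from source nodes to produce contigs."""
    edges = defaultdict(set)
    indeg = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            u, v = read[i:i + k - 1], read[i + 1:i + k]
            if v not in edges[u]:       # deduplicate repeated k-mers
                edges[u].add(v)
                indeg[v] += 1
    contigs = []
    for node in [n for n in list(edges) if indeg[n] == 0]:
        contig = node
        while len(edges[node]) == 1:    # extend while the path is unambiguous
            node = next(iter(edges[node]))
            contig += node[-1]
        contigs.append(contig)
    return contigs

reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
print(de_bruijn_contigs(reads, 4))      # overlapping reads merge into one contig
```

Real metagenomes add branching from repeats, errors, and strain variation, which is why the production assembler needs its specialized scaffolding step.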
diBELLA: Distributed Long Read to Long Read Alignment
We present a parallel algorithm and scalable implementation for genome
analysis, specifically the problem of finding overlaps and alignments for data
from "third generation" long read sequencers. While long sequences of DNA offer
enormous advantages for biological analysis and insight, current long read
sequencing instruments have high error rates and therefore require different
approaches to analysis than their short read counterparts. Our work focuses on
an efficient distributed-memory parallelization of an accurate single-node
algorithm for overlapping and aligning long reads. We achieve scalability of
this irregular algorithm by addressing the competing issues of increasing
parallelism, minimizing communication, constraining the memory footprint, and
ensuring good load balance. The resulting application, diBELLA, is the first
distributed memory overlapper and aligner specifically designed for long reads
and parallel scalability. We describe and present analyses for high level
design trade-offs and conduct an extensive empirical analysis that compares
performance characteristics across state-of-the-art HPC systems as well as
commercial cloud architectures, highlighting the advantages of modern
network technologies. Comment: This is the authors' preprint of the article that appears in the
proceedings of ICPP 2019, the 48th International Conference on Parallel
Processing
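The overlap step at the heart of this problem can be sketched at toy scale: index every k-mer, then treat reads that share a k-mer as candidate pairs for alignment. A minimal serial sketch (the real implementation distributes the k-mer index, filters high-frequency k-mers, and performs the actual alignments; `find_overlap_candidates` is a hypothetical name):

```python
from collections import defaultdict

def find_overlap_candidates(reads, k):
    """Toy k-mer based overlap detection: reads sharing any k-mer
    become candidate pairs for downstream alignment."""
    kmer_index = defaultdict(set)
    for rid, seq in enumerate(reads):
        for i in range(len(seq) - k + 1):
            kmer_index[seq[i:i + k]].add(rid)
    pairs = set()
    for rids in kmer_index.values():
        for a in rids:
            for b in rids:
                if a < b:                 # each unordered pair once
                    pairs.add((a, b))
    return pairs

reads = ["ATGGCGTACG", "CGTACGTTAA", "TTTTTTTTTT"]
print(find_overlap_candidates(reads, 5))  # reads 0 and 1 share k-mers
```

With error-prone long reads, k must be chosen so true overlaps still share exact k-mers, which is one reason this irregular, all-to-all-flavored computation is hard to parallelize with low communication.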